Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes
Abstract
In this paper we present the Slim-tree, a dynamic tree for organizing metric datasets in pages of fixed size. The Slim-tree uses the "fat-factor", which provides a simple way to quantify the degree of overlap between the nodes in a metric tree. It is well known that the degree of overlap directly affects the query performance of index structures. There are many suggestions to reduce overlap in multi-dimensional index structures, but the Slim-tree is the first metric structure explicitly designed to reduce the degree of overlap. Moreover, we present new algorithms for inserting objects and splitting nodes. The new insertion algorithm leads to a tree with high storage utilization and improved query performance, whereas the new split algorithm runs considerably faster than previous ones, generally without sacrificing search performance. Results obtained from experiments with real-world data sets show that the new algorithms of the Slim-tree consistently lead to performance improvements. For range queries, we observed improvements of up to 35%.
Similar resources
Fast Indexing and Visualization of Metric Data Sets using Slim-Trees
Many recent database applications must deal with similarity queries. For such applications, it is important to measure the similarity between two objects using the distance between them. Focusing on this problem, this paper proposes the Slim-tree, a new dynamic tree for organizing metric data sets in pages of fixed size. The Slim-tree uses the triangle inequality to prune distance calculations...
DBM-Tree: A Dynamic Metric Access Method Sensitive to Local Density Data
Metric Access Methods (MAM) are employed to accelerate the processing of similarity queries, such as range and k-nearest neighbor queries. Current methods improve query performance by minimizing the number of disk accesses while keeping a constant height for the structures stored on disk (height-balanced trees). The Slim-tree and the M-tree are the most efficient dynamic MAM so far. However...
On QoS Multicasting Performance in Wide Area Networks
Multicasting enables applications to scale to a large number of users without overloading the network and server resources. With the advent of multimedia applications, the focus of multicasting research has shifted from minimizing the overall cost of the multicast tree to finding one which supports the QoS requirements of the underlying multimedia application. Finding such a tree, however, is ...
Limit Theorems for Sequences of Random Trees
We consider a random tree and introduce a metric in the space of trees to define the “mean tree” as the tree minimizing the average distance to the random tree. When the resulting metric space is compact, we show laws of large numbers and central limit theorems for sequences of independent identically distributed random trees. As an application, we propose tests to check if two samples of random tree...
Branches in random recursive k-ary trees
In this paper, using generalized Pólya urn models, we find the expected value of the size of a branch in recursive $k$-ary trees. We also find the expectation of the number of nodes of a given outdegree in a branch of such trees.
Publication date: 2000